Thread: [C] - String Manipulation {Beginner Programmer}

  1. #1
    Registered User
    Join Date
    Nov 2004
    Posts
    93

    [C] - String Manipulation {Beginner Programmer}

    Hello.
    I have an assignment where I need to clean up a string, removing tabs, new lines, leading and trailing whitespace, yet leave proper spaces between words.

    So far I have been able to remove newlines and tabs, and replace them with a single space.



    Code:
    char s[] = "  widget;Acme  \t  \n Co.;gear    induction\b\tdevice:   \n ";

    Code:
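    int cleanSpace(char s[])
    {
        int i;

        /* Step one: replace every tab or newline with a single space. */
        for (i = 0; s[i]; i++)
        {
            if (s[i] == '\t' || s[i] == '\n')
            {
                s[i] = ' ';
            }
        }

        return i; /* i is the string length here; the function is declared int, so it must return a value */
    }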
    This will now output,
    Code:
    "   widget;Acme       Co.;gear    inductio device:    ".
    Now that the newlines and tabs are gone, I need to remove the leading, trailing and multiple whitespace in the middle.

    I am confused as to how I can approach this.

    My professor insisted that I not use ctype.h (isspace), that I work with the given function header, and that I not use pointers or a second buffer.

    He hinted that I should use 4 loops, with the first doing what I posted above. For the rest, he hinted that I should find a run of multiple spaces, find the next non-space character, and then copy everything down a spot.

    He insisted that I break it down, as I have above. Starting with removing new lines/tabs and replacing them with a space.

    Any help strongly appreciated.

    Thank you in advance.
    Last edited by INFERNO2K; 05-20-2005 at 01:56 PM.

  2. #2
    Gawking at stupidity
    Join Date
    Jul 2004
    Location
    Oregon, USA
    Posts
    3,218
    Well, there's more than one way to remove characters from a string. This is something that comes up a lot here. You basically have a couple of ways to do it:

    1) When you find an offending character, loop through the rest of the string, copying each character into the index just before it, like str[index] = str[index + 1];.
    2) Create a second string and only copy non-offending characters from the original string into it.

    You can use one of those tactics for your remaining tasks: removing leading, multiple, and trailing spaces.
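
    Option 1 can be sketched roughly as follows, using only array indexing (no pointers, no ctype.h, no second buffer). The names removeAt and squeezeSpaces are made up for this illustration and are not part of the assignment.

```c
/* Hypothetical helper: delete s[pos] by shifting the rest of the string,
   including the terminating '\0', one slot to the left. */
void removeAt(char s[], int pos)
{
    int k;

    for (k = pos; s[k] != '\0'; k++)
    {
        s[k] = s[k + 1];
    }
}

/* Hypothetical example: collapse every run of spaces down to one space. */
void squeezeSpaces(char s[])
{
    int i = 0;

    while (s[i] != '\0')
    {
        if (s[i] == ' ' && s[i + 1] == ' ')
        {
            /* Drop the second space and stay on i,
               because the character that slides in may be a space too. */
            removeAt(s, i + 1);
        }
        else
        {
            i++;
        }
    }
}
```

    Called on "Hello    World", squeezeSpaces leaves "Hello World". Every removal shifts the whole tail, so this does more copying than the two-index approach, but it is easy to reason about.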
    If you understand what you're doing, you're not learning anything.

  3. #3
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    Alternately, a simple state engine will allow you to solve this without looping through the string more than once. Consider the following pseudocode:
    Code:
    set currentIndex = 0
    set insertIndex = 0
    set state = State_Begin
    while not at end of string
        //get rid of other whitespace
        if string[currentIndex] is a type of whitespace other than a space
            replace it with a space
        //skip over all space characters until a non-space character
        if state is State_Begin
            if string[currentIndex] is not a space
                state = State_Normal
            end if
        end if
        //copy characters up to a space character, then switch states
        if state is State_Normal
            copy string[currentIndex] to string[insertIndex]
            increment insertIndex
            if string[currentIndex] is a space
                state = State_Begin
            end if
        end if
        increment currentIndex
    end while
    place the null character appropriately
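
    One possible C translation of the pseudocode above, as a sketch only. The function name, the enum constants, and the choice to treat only '\t' and '\n' as "other whitespace" are assumptions for illustration; it uses array indexing throughout.

```c
enum { STATE_BEGIN, STATE_NORMAL };

/* Hypothetical single-pass cleaner following the state-engine pseudocode.
   Returns the length of the cleaned string. */
int cleanSpaceStateEngine(char s[])
{
    int current = 0;        /* character being examined */
    int insert = 0;         /* where the next kept character goes */
    int state = STATE_BEGIN;

    while (s[current] != '\0')
    {
        char c = s[current];

        /* Get rid of other whitespace by treating it as a space. */
        if (c == '\t' || c == '\n')
        {
            c = ' ';
        }

        /* Skip spaces until a non-space character shows up. */
        if (state == STATE_BEGIN && c != ' ')
        {
            state = STATE_NORMAL;
        }

        /* Copy characters up to and including one space, then switch states. */
        if (state == STATE_NORMAL)
        {
            s[insert++] = c;
            if (c == ' ')
            {
                state = STATE_BEGIN;
            }
        }

        current++;
    }

    /* "Place the null character appropriately": at most one trailing
       space can have been copied, so back up over it if it is there. */
    if (insert > 0 && s[insert - 1] == ' ')
    {
        insert--;
    }
    s[insert] = '\0';

    return insert;
}
```

    Because insert never gets ahead of current, writing to s[insert] cannot overwrite a character that has not been read yet.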
    Last edited by pianorain; 05-20-2005 at 02:42 PM.
    If I did your homework for you, then you might pass your class without learning how to write a program like this. Then you might graduate and get your degree without learning how to write a program like this. You might become a professional programmer without knowing how to write a program like this. Someday you might work on a project with me without knowing how to write a program like this. Then I would have to do you serious bodily harm. - Jack Klein

  4. #4
    & the hat of GPL slaying Thantos's Avatar
    Join Date
    Sep 2001
    Posts
    5,681
    Kinda like what itsme said but a little different.

    You keep two indexes: one that indicates where to put the next valid character and one that tells you which character you are looking at. This keeps you from having to shuffle too much.

    You'll also need to keep a couple of "flags".

    Let's take a look at just removing the extra spaces in the middle:
    example:
    Code:
    char arr[] = "Hello    World";
    That has 4 spaces in between. Let's start by setting our two indexes, called valid and current, both to 0:
    Code:
    valid = current = 0;
    Let's also keep a flag that lets us know if we've already copied a space over.
    Code:
    int copiedSpace = 0;
    Now we compare the current character to space
    Code:
    if ( arr[current] != ' ')
    If it's not a space, we need to copy it over to the valid spot and increment valid:
    Code:
    {
      arr[valid++] = arr[current];
      copiedSpace = 0; /* We didn't just copy a space over*/
    }
    If it was a space, we need to see if we've already copied over a space:
    Code:
    else if ( copiedSpace == 0 )
    {
      arr[valid++] = ' ';
      copiedSpace = 1; /* Because we did just copy a space over*/
    }
    But no matter what, we have to move on to the next character:
    Code:
    current++;
    Do this until current gets to the end of the string and then throw a null character in at valid
    Code:
    arr[valid] = '\0';
    With a couple more status flags you can remove all the extra spaces in one pass.
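
    Put together, the fragments in this post might look like the sketch below. The function name collapseSpaces is made up for illustration; the variable names are the ones used in the fragments. It only collapses runs of spaces and does nothing special for leading or trailing ones.

```c
/* Hypothetical wrapper around the fragments from this post:
   collapse each run of spaces into a single space, in place. */
void collapseSpaces(char arr[])
{
    int valid = 0;        /* where the next kept character goes */
    int current = 0;      /* character being examined */
    int copiedSpace = 0;  /* 1 if the last character copied was a space */

    while (arr[current] != '\0')
    {
        if (arr[current] != ' ')
        {
            arr[valid++] = arr[current];
            copiedSpace = 0;  /* we didn't just copy a space over */
        }
        else if (copiedSpace == 0)
        {
            arr[valid++] = ' ';
            copiedSpace = 1;  /* because we did just copy a space over */
        }
        current++;            /* always move on to the next character */
    }

    arr[valid] = '\0';
}
```

    With the example string "Hello    World" this leaves "Hello World". A string with leading or trailing spaces keeps one space at each end.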

  5. #5
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    Quote Originally Posted by Thantos
    Do this until current gets to the end of the string and then throw a null character in at valid
    Code:
    arr[valid] = '\0';
    Careful...gotta remove all trailing spaces. If you just throw a null on the end and the original string had trailing spaces, then you'll have one trailing space at the end too.

  6. #6
    & the hat of GPL slaying Thantos's Avatar
    Join Date
    Sep 2001
    Posts
    5,681
    Read my example string. I was dealing with a case of only spaces in the middle. That's why AFTER that I made mention of needing further flags. And using my process you'd have at most one trailing space, so it's a simple if statement to deal with that.

  7. #7
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    Hence why reading is better than skimming. My bad.

  8. #8
    & the hat of GPL slaying Thantos's Avatar
    Join Date
    Sep 2001
    Posts
    5,681

    BTW I just tried it at home and there isn't any need for additional flags.
    To take care of leading spaces, just initialize copiedSpace to 1.
    Trailing spaces are taken care of by checking whether copiedSpace is 1 or 0 prior to appending the null, and adjusting valid accordingly.
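
    A sketch of what those two tweaks might look like, assuming tabs and newlines have already been turned into spaces by an earlier loop. The function name trimAndCollapse and the returned length are assumptions for illustration.

```c
/* Hypothetical version of the two-index loop with the leading and
   trailing tweaks applied. Returns the length of the cleaned string. */
int trimAndCollapse(char arr[])
{
    int valid = 0;
    int current;
    int copiedSpace = 1;  /* start at 1 so leading spaces are never copied */

    for (current = 0; arr[current] != '\0'; current++)
    {
        if (arr[current] != ' ')
        {
            arr[valid++] = arr[current];
            copiedSpace = 0;
        }
        else if (copiedSpace == 0)
        {
            arr[valid++] = ' ';
            copiedSpace = 1;
        }
    }

    /* If the last thing copied was a space, back up over it.
       The valid > 0 check covers an empty or all-space string. */
    if (copiedSpace == 1 && valid > 0)
    {
        valid--;
    }
    arr[valid] = '\0';

    return valid;
}
```

    For "   Hello    World   " this leaves "Hello World" and returns 11.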

  9. #9
    Registered User
    Join Date
    Nov 2004
    Posts
    93
    Oh man I am so confused.

    Can somebody test out this string (fixing up the whitespace and removing the \t's and \n's) and tell me if strlen returns 39? If 39 is returned, that means I pass the test in my test main().

    Code:
      widget;Acme  \t  \n Co.;gear    induction\b\tdevice:   \n

  10. #10
    & the hat of GPL slaying Thantos's Avatar
    Join Date
    Sep 2001
    Posts
    5,681
    Yeah I'm getting 39 also

  11. #11
    Registered User
    Join Date
    Nov 2004
    Posts
    93
    Ok, after fiddling around with this code, this is what I have produced.

    Code:
        int curChar, nextAvail;

        for (curChar = nextAvail = 0; s[curChar]; curChar++)
        {
            if (s[curChar] != ' ' && s[curChar] != '\t' && s[curChar] != '\n'
                && s[curChar] != '\b')
            {
                s[nextAvail++] = s[curChar];
            }
            else
            {
                if (nextAvail > 0 && s[nextAvail-1] != ' ')
                {
                    s[nextAvail++] = ' ';
                }
            }
        }

        s[nextAvail] = 0;

        return strlen(s);

    My run of the test main(); produces this
    Code:
    Testing your init function.
    
    Failed test eight, customer should contain:
    widget;Acme Co.;gear induction device:♂
    
    your string contained:
    widget;Acme Co.;gear induction device:
    What the heck is that funky symbol at the end of what it should contain, and what code do I need to fix this up?
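
    Separately from the odd symbol, the loop above can still leave one space at the end when the input ends in whitespace, which is the case raised in post #5. A minimal sketch of one way to handle it, keeping the same variable names and the same set of characters treated as whitespace; wrapping it in the cleanSpace header from the first post is an assumption, and whether '\b' belongs in that set depends on what the test expects.

```c
#include <string.h>

/* Sketch: the loop from this post plus a trailing-space check. */
int cleanSpace(char s[])
{
    int curChar, nextAvail;

    for (curChar = nextAvail = 0; s[curChar]; curChar++)
    {
        if (s[curChar] != ' ' && s[curChar] != '\t' && s[curChar] != '\n'
            && s[curChar] != '\b')
        {
            s[nextAvail++] = s[curChar];
        }
        else if (nextAvail > 0 && s[nextAvail - 1] != ' ')
        {
            s[nextAvail++] = ' ';
        }
    }

    /* At most one trailing space can have been copied; drop it. */
    if (nextAvail > 0 && s[nextAvail - 1] == ' ')
    {
        nextAvail--;
    }
    s[nextAvail] = '\0';

    return (int)strlen(s);
}
```

    For the input "  a \t b \n " this leaves "a b" and returns 3.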

  12. #12
    Registered User
    Join Date
    Nov 2004
    Posts
    93
    Ok I figured out that is a vertical tab.

    One test wants a vertical tab, the next one doesn't.

  13. #13
    Registered User
    Join Date
    Nov 2004
    Posts
    93

    It would be best if you could see what I am talking about.

    I am posting my code and my assignment's main.

  14. #14
    Registered User
    Join Date
    Nov 2004
    Posts
    93
    Please, can someone have a look and compile it to see the returned test error?

  15. #15
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Why don't you just tell us what the error is? Oh, and stop compiling as C++.

    Quzah.
    Hope is the first step on the road to disappointment.
